A five-level static cache architecture for web search engines

Authors

Rifat Ozcan

Ismail Sengör Altingövde

Berkant Barla Cambazoglu

Flavio Paiva Junqueira

Özgür Ulusoy

Abstract

Caching is a crucial performance component of large-scale web search engines, as it greatly helps reduce average query response times and query processing workloads on backend search clusters. In this paper, we describe a multi-level static cache architecture that stores five different item types: query results, precomputed scores, posting lists, precomputed intersections of posting lists, and documents. Moreover, we propose a greedy heuristic to prioritize items for caching, based on gains computed by using items’ past access frequencies, estimated computational costs, and storage overheads. This heuristic takes into account the inter-dependency between individual items when making its caching decisions, i.e., after a particular item is cached, gains of all items that are affected by this decision are updated. Our simulations under realistic assumptions reveal that the proposed heuristic performs better than dividing the entire cache space among particular item types at fixed proportions. © 2010 Elsevier Ltd. All rights reserved.
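The greedy, dependency-aware selection described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the gain formula, so the ratio `freq * cost / size` is an assumption, and the `Item` class, the `absorbs` mapping (how many accesses to another item disappear once this item is cached), and the function names are all hypothetical.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Item:
    name: str
    freq: float   # past access frequency
    cost: float   # estimated computational cost saved per cache hit
    size: int     # storage overhead
    # Assumed dependency model: name of another item -> number of its
    # accesses that no longer occur once this item is cached.
    absorbs: dict = field(default_factory=dict)


def gain(item: Item) -> float:
    """Assumed gain: expected cost saved per unit of cache space."""
    return item.freq * item.cost / item.size


def greedy_fill(items, capacity):
    """Pick items greedily by gain, updating dependent gains after each pick."""
    # Work on copies so the caller's items are not modified.
    remaining = {it.name: replace(it) for it in items}
    cached, free = [], capacity
    while remaining:
        fitting = [it for it in remaining.values() if it.size <= free]
        if not fitting:
            break
        best = max(fitting, key=gain)
        if gain(best) <= 0:
            break
        cached.append(best.name)
        free -= best.size
        del remaining[best.name]
        # Caching `best` lowers the access frequency (and thus the gain)
        # of the items that depend on it.
        for dep, absorbed in best.absorbs.items():
            if dep in remaining:
                remaining[dep].freq = max(0.0, remaining[dep].freq - absorbed)
    return cached
```

For example, if a cached query result absorbs most accesses to a posting list, that list's gain drops after the result is selected, and a different posting list may be chosen for the remaining space. A fixed-proportion split of the cache among item types cannot react to such interactions.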

Similar resources

Modeling Static Caching in Web Search Engines

In this paper we model a two-level cache of a Web search engine, such that given memory resources, we find the optimal split fraction to allocate for each cache, results and index. The final result is very simple and requires computing just five parameters that depend on the input data and the performance of the search engine. The model is validated through extensive experimental results and is...

Integrating WWW Caches and Search Engines

In this paper we propose the concept of cache plugins, which are customized programs that run on WWW cache servers and perform some of the search engine tasks. We describe a prototype implementation of a cache plugin to answer client requests directed to a large search engine, using a nearby cache server to store static objects. Experimental results using actual logs show a significant improvement on...

A Cost-Aware Strategy for Query Result Caching in Web Search Engines

Search engines and large scale IR systems need to cache query results for efficiency and scalability purposes. In this study, we propose to explicitly incorporate the query costs in the static caching policy. To this end, a query’s cost is represented by its execution time, which involves CPU time to decompress the postings and compute the query-document similarities to obtain the final top-N a...

Architecture and Design Of High Volume Web Sites

Architecting and designing high volume Web sites has changed immensely over the last six years. These changes include the availability of inexpensive Pentium based servers, Linux, Java applications, commodity switches, connection management and caching engines, bandwidth price reductions, content distribution services, and many others. This paper describes the evolution of the best practices wi...

A machine learning approach for result caching in web search engines

A commonly used technique for improving search engine performance is result caching. In result caching, precomputed results (e.g., URLs and snippets of best matching pages) of certain queries are stored in a fast-access storage. The future occurrences of a query whose results are already stored in the cache can be directly served by the result cache, eliminating the need to process the query us...

Journal:

Inf. Process. Manage.

Volume 48

Publication date: 2012
